Setup. Algorithm \(f(x_1, x_2) = x_1\) trained on data where \(X_1 \equiv X_2\).
Conditional SHAP attributes non-zero importance to \(X_2\).
Marginal SHAP correctly gives \(\varphi_2 = 0\).
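This can be checked numerically. Below is a minimal sketch (hypothetical toy data; crude kernel conditioning with a tolerance window stands in for a proper conditional estimator) computing the exact two-player Shapley value \(\varphi_2\) under each value function:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data where X1 and X2 are identical.
x1 = rng.normal(size=10_000)
data = np.column_stack([x1, x1])          # X1 ≡ X2

f = lambda X: X[:, 0]                     # the algorithm uses only x1

def v_marginal(S, x):
    """Value of coalition S at point x, filling absent features from the marginal."""
    X = data.copy()
    X[:, S] = x[S]                        # fix coalition features, vary the rest
    return f(X).mean()

def v_conditional(S, x, tol=0.05):
    """Same, but absent features are drawn conditionally on the coalition."""
    X = data
    mask = np.ones(len(X), bool)
    for j in S:
        mask &= np.abs(X[:, j] - x[j]) < tol   # crude kernel conditioning
    X = X[mask].copy()
    X[:, S] = x[S]
    return f(X).mean()

def phi2(v, x):
    # Exact Shapley value of feature 2 with two players: average of its
    # marginal contribution to the empty coalition and to {1}.
    return 0.5 * ((v([1], x) - v([], x)) + (v([0, 1], x) - v([0], x)))

x = np.array([1.0, 1.0])
print(phi2(v_marginal, x))     # ~0:   marginal SHAP ignores X2
print(phi2(v_conditional, x))  # ~0.5: conditional SHAP credits X2
```

Under the conditional value function the credit for the shared signal is split between the two duplicated features; under the marginal one the unused feature gets exactly zero.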
Question. Should we have preferred Marginal SHAP all along?
The function-to-explain can be viewed as a causal graph.
(draw causal graph with X and \(\tilde{X}\))
This distinguishes between observed data \(\tilde{X}\) and function inputs \(X\).
When we set \(X_1 = x_1\) and let other inputs vary, how does \(Y\) change? It depends on how we vary the other inputs.
Observational (seeing). \(\mathbf{E}[Y \mid X_1 = x_1]\)
“Among data with \(X_1 = x_1\), what is average \(Y\)?”
Interventional (doing). \(\mathbf{E}[Y \mid \text{do}(X_1 = x_1)]\)
“If we force \(X_1 = x_1\) regardless of the other inputs, what is the average \(Y\)?”

These differ when features are correlated in training data.
Consider:
        Z
      ↙   ↘
  X₁ →  Y  ← X₂
Backdoor formula. \[\mathbf{E}[Y \mid \text{do}(X_1 = x_1)] = \int \mathbf{E}[Y \mid x_1, x_2] p(x_2) \, dx_2\]
The do-operator deletes incoming edges to \(X_1\). Edge \(Z \to X_1\) is severed.
Contrast this with the formula for the usual conditional expectation.
\[\mathbf{E}[Y \mid \text{do}(X_1 = x_1)] = \int \mathbf{E}[Y \mid x_1, x_2] p(x_2) \, dx_2\]
\[\mathbf{E}[Y \mid X_1 = x_1] = \int \mathbf{E}[Y \mid x_1, x_2] p(x_2 \vert x_{1}) \, dx_2\]
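The gap between seeing and doing can be simulated. A sketch under an assumed structural model for the graph above (all coefficients hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical structural model: Z -> X1, Z -> X2, and X1 -> Y <- X2.
z = rng.normal(size=n)
x1 = z + 0.1 * rng.normal(size=n)
x2 = z + 0.1 * rng.normal(size=n)
y = lambda a, b: a + b                    # Y = X1 + X2

# Seeing: E[Y | X1 ≈ 1] — condition on the observed data.
near = np.abs(x1 - 1.0) < 0.05
seeing = y(x1[near], x2[near]).mean()

# Doing: E[Y | do(X1 = 1)] — sever Z -> X1, keep p(x2) as-is.
doing = y(np.full(n, 1.0), x2).mean()

print(round(seeing, 2))   # ≈ 2: observing X1 = 1 implies X2 ≈ 1
print(round(doing, 2))    # ≈ 1: forcing X1 = 1 leaves E[X2] = 0
```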
Algorithm \(f\) has no confounders:
(draw causal graph with X and \(\tilde{X}\))
\[\mathbf{E}[Y \mid \text{do}(X_S = x_S)] = \int f(x_S, x_{\bar{S}}) p(x_{\bar{S}}) \, dx_{\bar{S}}\]
No backdoor paths exist, so no conditioning is needed and this is the marginal expectation.
Algebraic. \[\mathbf{E}[f(x_S, X_{\bar{S}})] = \int f(x_S, x_{\bar{S}}) p(x_{\bar{S}}) \, dx_{\bar{S}}\]
Causal.
Intervention \(\text{do}(X_S = x_S)\) deletes any edges into \(X_S\). Distribution \(p(x_{\bar{S}})\) unchanged.
Computational. Sample \(x_{\bar{S}}^{(i)} \sim p(x_{\bar{S}})\) from the background data, ignoring \(x_S\); average \(f(x_S, x_{\bar{S}}^{(i)})\).
Algebraic. \[\mathbf{E}[f(x_S, X_{\bar{S}}) \mid X_S = x_S] = \int f(x_S, x_{\bar{S}}) p(x_{\bar{S}} \mid x_S) \, dx_{\bar{S}}\]
Causal. Asks “what do we see when \(X_S = x_S\)?” not “what happens when we set \(X_S = x_S\)?”
Computational. Sample \(x_{\bar{S}}^{(i)} \sim p(x_{\bar{S}} \mid x_S)\), e.g. by restricting the background data to points matching \(x_S\); average \(f(x_S, x_{\bar{S}}^{(i)})\).
Janzing et al. (2020) argue that we should focus on the “Algorithm” causal structure.
Data generation (real world).
        Z̃ (confounders)
      ↙   ↘
    X̃₁     X̃₂
How do real-world features relate?
Algorithm (computation).
X₁ → Y = f(X)
X₂ →
How does the prediction function process inputs?
Janzing et al. (2020) distinguish:
- \(\tilde{X}_j\): real-world feature
- \(X_j\): algorithm input
Usually \(X_j = \tilde{X}_j\), but they are distinct objects with distinct causal graphs.
Reality. Cannot change \(\tilde{X}_1\) without affecting \(\tilde{X}_2\) (shared causes).
Algorithm. Can evaluate \(f(x_1, x_2)\) for any values.
SHAP explains \(f\), so arguably should use the algorithm’s causal structure.
Setup. \(f(x_1, x_2) = x_1\) where \(X_1 \equiv X_2\) in data.
Marginal. \[v^{\text{marg}}(\{2\}) = \int x_1 p(x_1) dx_1 = \mathbf{E}[X_1]\]
Conditional. \[v^{\text{cond}}(\{2\}) = \int x_1 p(x_1 \mid x_2) dx_1 = x_2\]
Conditional approach “sees” correlation. Marginal approach intervenes in the algorithm.
Algorithm: \(\hat{y} = \hat{\alpha}_0 + \hat{\alpha}_1 x_1 + \hat{\alpha}_2 x_2\)
True causal effect of \(X_2\) on \(Y\): zero.
Changing \(x_2\) (while sampling \(x_1\) independently) does not affect predictions.
Correctly identifies \(X_2\) as irrelevant.
Observing \(x_2 = 1\) implies \(x_1 \approx 1\), so \(x_2\) appears predictive.
Correlation \(\neq\) causal relevance in the algorithm.
Linear models: \(\varphi_j = \alpha_j (x_j - \mathbf{E}[X_j])\)
Since \(\alpha_2 = 0\), we have \(\varphi_2 = 0\) for all observations.
The algorithm does not use \(X_2\).
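The closed form can be verified against a brute-force Shapley computation with the marginal value function. A sketch with hypothetical coefficients and correlated Gaussian background data:

```python
import numpy as np
from itertools import combinations
from math import factorial

rng = np.random.default_rng(0)

# Hypothetical linear model; the first coefficient is zero, so feature 0 is unused.
alpha = np.array([0.0, 2.0, -1.0])
f = lambda X: X @ alpha
cov = [[1.0, 0.9, 0.5], [0.9, 1.0, 0.5], [0.5, 0.5, 1.0]]
data = rng.multivariate_normal([0.0, 0.0, 0.0], cov, size=50_000)

def v(S, x):
    """Marginal value function: fix x_S, draw the rest from the background data."""
    X = data.copy()
    X[:, list(S)] = x[list(S)]
    return f(X).mean()

def shapley(j, x, d=3):
    """Exact Shapley value of feature j by summing over all coalitions."""
    phi = 0.0
    others = [k for k in range(d) if k != j]
    for r in range(d):
        for S in combinations(others, r):
            w = factorial(len(S)) * factorial(d - len(S) - 1) / factorial(d)
            phi += w * (v(S + (j,), x) - v(S, x))
    return phi

x = np.array([1.0, 1.0, 1.0])
for j in range(3):
    print(j, round(shapley(j, x), 2), round(alpha[j] * (x[j] - data[:, j].mean()), 2))
# The two columns agree: phi_j = alpha_j (x_j - E[X_j]),
# and the unused feature gets zero despite the correlations.
```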
Marginal. Small error due to finite sampling; correctly near zero.
Conditional. Larger error; attributes importance to \(X_2\) due to its correlation with \(X_1\).
Marginal sampling will extrapolate: we test the algorithm on all possible inputs, including feature combinations never seen in the training data.
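A small sketch of this extrapolation, reusing the \(X_1 \equiv X_2\) toy setup: fixing a hypothetical \(x_2 = 3\) while drawing \(x_1\) from its marginal generates evaluation points far off the data manifold (the diagonal \(x_1 = x_2\)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Data manifold: X1 ≡ X2, so every observed point lies on the diagonal.
x1 = rng.normal(size=1000)
data = np.column_stack([x1, x1])

# Marginal sampling for the coalition {X2 = 3}: pair the fixed x2 with
# background draws of x1, producing inputs far from the diagonal.
x2_fixed = 3.0
synthetic = np.column_stack([data[:, 0], np.full(1000, x2_fixed)])

# Distance from the data manifold for each synthetic evaluation point.
off_manifold = np.abs(synthetic[:, 0] - synthetic[:, 1])
print(off_manifold.mean())   # ~3: the algorithm is queried where it never saw data
```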